Overview

Dataset statistics

Number of variables 14
Number of observations 20733
Missing cells 195238
Missing cells (%) 67.3%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 2.2 MiB
Average record size in memory 112.0 B
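
The headline figures above are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below only reuses numbers from the table; the variable names are illustrative, and the memory estimate is a rough cross-check rather than the way the profiler measures memory.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 14
n_observations = 20733
missing_cells = 195238
avg_record_bytes = 112.0

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Rough cross-check: rows times average record size, expressed in MiB.
total_mib = n_observations * avg_record_bytes / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. size in memory: {total_mib:.1f} MiB")
```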

Variable types

DateTime 1
Categorical 4
Numeric 9

Dataset

Description Sensor that returns a label identifying the activity performed by the user, accurately detected using low power signals from multiple sensors in the device. This is achieved using Google’s Activity Recognition APIs. Possible activities are: still, in_vehicle, on_bycicle, on_foot, running, tilting, walking. To compare each sensor observation, the frequency was reduced to one minute. The first non-missing name is reported for each of the categorical variables.
Creator Matteo Busso, Massimo Stefan
Author Fausto Giunchiglia, Ivano Bison, Matteo Busso, Ronald Chenu-Abente, Marcelo Rodas Britez, Can Gunel, Giuseppe Veltri, Amalia de Götzen, Peter Kun, Amarsanaa Ganbold, Altangerel Chagnaa, George Gaskell, Miriam Bidoglia, Luca Cernuzzi, Alethia Hume, Jose Luis Zarza, Daniele Miorandi, Carlo Caprini
URL
Copyright (c) KnowDive 2022
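
The description states that the frequency was reduced to one minute and that the first non-missing name is reported for categorical variables. A minimal sketch of that rule follows. The input shape (pairs of datetime and label), the function name, the sample readings, and the year used in them are all assumptions made for illustration, not the authors' pipeline.

```python
from datetime import datetime

def first_label_per_minute(readings):
    """Keep, for every one-minute bucket, the first non-missing label.

    `readings` is an iterable of (datetime, label-or-None) pairs; this input
    shape is hypothetical and chosen only for the illustration.
    """
    per_minute = {}
    for ts, label in sorted(readings, key=lambda r: r[0]):
        minute = ts.replace(second=0, microsecond=0)
        # Fill the bucket only while it is still empty or missing.
        if per_minute.get(minute) is None:
            per_minute[minute] = label
    return per_minute

# Made-up readings; the year 2020 is an arbitrary placeholder.
readings = [
    (datetime(2020, 11, 22, 3, 46, 5), None),
    (datetime(2020, 11, 22, 3, 46, 20), "still"),
    (datetime(2020, 11, 22, 3, 46, 50), "walking"),
    (datetime(2020, 11, 22, 3, 47, 10), "tilting"),
]
result = first_label_per_minute(readings)
for minute, label in result.items():
    print(minute, label)
```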

Variable descriptions

experimentId Experiment Id
userId User id
timestamp Timestamp showing month(2), day(2), hour(2), minute(2), second(2), decimals(3)
day Day showing month(2), day(2)
label The activity name with highest accuracy
accuracy The highest accuracy for possible activities
InVehicle The value of the "in_vehicle" activity
OnBycicle The value of the "on_bycicle" activity
OnFoot The value of the "on_foot" activity
Running The value of the "running" activity
Still The value of the "still" activity
Unknown The value of the "unknown" activity
Walking The value of the "walking" activity
Tilting The value of the "tilting" activity
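
According to the descriptions above, label and accuracy summarise the per-activity columns: label names the activity with the highest value and accuracy holds that value. A minimal sketch of this relationship, assuming each row is available as a dictionary with missing values stored as None (the function name and the sample row are hypothetical):

```python
def top_activity(confidences):
    """Return (label, accuracy) for the activity with the highest value.

    Missing values (None) are ignored; (None, None) is returned when
    every activity is missing.
    """
    present = {name: value for name, value in confidences.items() if value is not None}
    if not present:
        return None, None
    label = max(present, key=present.get)
    return label, present[label]

# Made-up row, using the column names from the list above.
row = {
    "InVehicle": 2, "OnBycicle": 1, "OnFoot": 3, "Running": None,
    "Still": 96, "Unknown": 1, "Walking": 3, "Tilting": None,
}
label, accuracy = top_activity(row)
print(label, accuracy)
```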

Alerts

experimentId has constant value "wenet" Constant
Tilting has constant value "100.0" Constant
accuracy is highly correlated with Still High correlation
OnBycicle is highly correlated with Unknown High correlation
OnFoot is highly correlated with Running and 1 other fields High correlation
Running is highly correlated with OnFoot and 2 other fields High correlation
Still is highly correlated with accuracy and 1 other fields High correlation
Unknown is highly correlated with OnBycicle and 2 other fields High correlation
Walking is highly correlated with OnFoot and 1 other fields High correlation
accuracy is highly correlated with Still and 1 other fields High correlation
OnFoot is highly correlated with Walking High correlation
Still is highly correlated with accuracy and 1 other fields High correlation
Unknown is highly correlated with accuracy and 1 other fields High correlation
Walking is highly correlated with OnFoot High correlation
accuracy is highly correlated with Still High correlation
OnBycicle is highly correlated with Unknown High correlation
OnFoot is highly correlated with Running and 1 other fields High correlation
Running is highly correlated with OnFoot and 1 other fields High correlation
Still is highly correlated with accuracy and 1 other fields High correlation
Unknown is highly correlated with OnBycicle and 1 other fields High correlation
Walking is highly correlated with OnFoot and 1 other fields High correlation
Unknown is highly correlated with experimentId and 1 other fields High correlation
experimentId is highly correlated with Unknown and 2 other fields High correlation
label is highly correlated with experimentId and 1 other fields High correlation
Tilting is highly correlated with Unknown and 2 other fields High correlation
userId is highly correlated with day High correlation
day is highly correlated with userId High correlation
label is highly correlated with accuracy and 6 other fields High correlation
accuracy is highly correlated with label and 5 other fields High correlation
InVehicle is highly correlated with label and 2 other fields High correlation
OnBycicle is highly correlated with label High correlation
OnFoot is highly correlated with label and 5 other fields High correlation
Running is highly correlated with OnFoot and 1 other fields High correlation
Still is highly correlated with label and 5 other fields High correlation
Unknown is highly correlated with label and 3 other fields High correlation
Walking is highly correlated with label and 4 other fields High correlation
experimentId has 12799 (61.7%) missing values Missing
userId has 12799 (61.7%) missing values Missing
day has 12799 (61.7%) missing values Missing
label has 12799 (61.7%) missing values Missing
accuracy has 12799 (61.7%) missing values Missing
InVehicle has 16280 (78.5%) missing values Missing
OnBycicle has 17042 (82.2%) missing values Missing
OnFoot has 16494 (79.6%) missing values Missing
Running has 17904 (86.4%) missing values Missing
Still has 13023 (62.8%) missing values Missing
Unknown has 15043 (72.6%) missing values Missing
Walking has 16494 (79.6%) missing values Missing
Tilting has 18963 (91.5%) missing values Missing
timestamp has unique values Unique
userId has 308 (1.5%) zeros Zeros
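
The Missing alerts can be cross-checked against the dataset-level count of missing cells. The sketch below copies the per-variable counts from the alerts (timestamp has no missing values) and recomputes the percentages; it is a consistency check, not the profiler's own code.

```python
n_rows = 20733

# Missing counts per variable, copied from the alerts above.
missing = {
    "timestamp": 0, "experimentId": 12799, "userId": 12799, "day": 12799,
    "label": 12799, "accuracy": 12799, "InVehicle": 16280, "OnBycicle": 17042,
    "OnFoot": 16494, "Running": 17904, "Still": 13023, "Unknown": 15043,
    "Walking": 16494, "Tilting": 18963,
}

total_missing = sum(missing.values())
missing_pct = {name: round(100 * count / n_rows, 1) for name, count in missing.items()}

print("total missing cells:", total_missing)
for name, pct in missing_pct.items():
    print(f"{name}: {pct}%")
```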

Reproduction

Analysis started 2022-07-04 18:02:06.118581
Analysis finished 2022-07-04 18:02:29.217307
Duration 23.1 seconds
Software version pandas-profiling v3.2.0
Download configuration config.json

Variables

timestamp
Date

UNIQUE

Timestamp showing month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct 20733
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 162.1 KiB
Minimum 1900-11-22 03:46:00
Maximum 1900-12-06 13:18:00
Histogram with fixed size bins (bins=50)
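
The timestamp layout carries no year, which would explain why Minimum and Maximum fall in 1900. The sketch below parses such a value under the assumption that the raw field is a 13-digit string MMDDhhmmssfff; the function name, the raw string, and the default year of 1900 are illustrative assumptions.

```python
from datetime import datetime

def parse_timestamp(raw, year=1900):
    """Parse an 'MMDDhhmmssfff' string into a datetime.

    The year is not stored in the value; the default of 1900 is an
    assumption chosen to match the Minimum and Maximum shown above.
    """
    raw = str(raw).zfill(13)  # restore a leading zero lost by numeric storage
    month, day = int(raw[0:2]), int(raw[2:4])
    hour, minute, second = int(raw[4:6]), int(raw[6:8]), int(raw[8:10])
    millis = int(raw[10:13])
    return datetime(year, month, day, hour, minute, second, millis * 1000)

parsed = parse_timestamp("1122034600000")
print(parsed)
```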

experimentId
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Experiment Id

Distinct 1
Distinct (%) < 0.1%
Missing 12799
Missing (%) 61.7%
Memory size 162.1 KiB
wenet 7934

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 39670
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row wenet
2nd row wenet
3rd row wenet
4th row wenet
5th row wenet

Common Values

Value Count Frequency (%)
wenet 7934 38.3%
(Missing) 12799 61.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
wenet 7934 100.0%

Most occurring characters

Value Count Frequency (%)
e 15868 40.0%
w 7934 20.0%
n 7934 20.0%
t 7934 20.0%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 39670 100.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 15868 40.0%
w 7934 20.0%
n 7934 20.0%
t 7934 20.0%

Most occurring scripts

Value Count Frequency (%)
Latin 39670 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 15868 40.0%
w 7934 20.0%
n 7934 20.0%
t 7934 20.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 39670 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 15868 40.0%
w 7934 20.0%
n 7934 20.0%
t 7934 20.0%

userId
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING
ZEROS

User id

Distinct 6
Distinct (%) 0.1%
Missing 12799
Missing (%) 61.7%
Infinite 0
Infinite (%) 0.0%
Mean 8.418326191
Minimum 0
Maximum 14
Zeros 308
Zeros (%) 1.5%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 0
5-th percentile 4
Q1 4
median 11
Q3 13
95-th percentile 13
Maximum 14
Range 14
Interquartile range (IQR) 9

Descriptive statistics

Standard deviation 4.721469017
Coefficient of variation (CV) 0.5608560312
Kurtosis -1.777878199
Mean 8.418326191
Median Absolute Deviation (MAD) 3
Skewness -0.1352946335
Sum 66791
Variance 22.29226968
Monotonicity Not monotonic
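
The statistics above can be reproduced from the userId value counts reported for this variable. The sketch below rebuilds the non-missing values from those counts and recomputes the main figures with the standard library. It assumes the sample standard deviation (n - 1 denominator) and linear interpolation for the quartiles, which is what the reported numbers are consistent with.

```python
import statistics

# userId value counts from the report (non-missing values only).
counts = {13: 3619, 4: 3513, 0: 308, 11: 239, 14: 216, 1: 39}
values = sorted(value for value, n in counts.items() for _ in range(n))

total = sum(values)
mean = statistics.mean(values)
std = statistics.stdev(values)          # sample standard deviation (n - 1)
cv = std / mean                         # coefficient of variation
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(f"sum={total} mean={mean:.6f} std={std:.6f} cv={cv:.6f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={iqr}")
```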
Histogram with fixed size bins (bins=6)
Value Count Frequency (%)
13 3619 17.5%
4 3513 16.9%
0 308 1.5%
11 239 1.2%
14 216 1.0%
1 39 0.2%
(Missing) 12799 61.7%

Value Count Frequency (%)
0 308 1.5%
1 39 0.2%
4 3513 16.9%
11 239 1.2%
13 3619 17.5%
14 216 1.0%

Value Count Frequency (%)
14 216 1.0%
13 3619 17.5%
11 239 1.2%
4 3513 16.9%
1 39 0.2%
0 308 1.5%

day
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

day showing month(2), day(2)

Distinct 15
Distinct (%) 0.2%
Missing 12799
Missing (%) 61.7%
Infinite 0
Infinite (%) 0.0%
Mean 1153.450718
Minimum 1122
Maximum 1206
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1122
5-th percentile 1122
Q1 1125
median 1129
Q3 1202
95-th percentile 1204
Maximum 1206
Range 84
Interquartile range (IQR) 77

Descriptive statistics

Standard deviation 36.90225817
Coefficient of variation (CV) 0.0319929214
Kurtosis -1.647152659
Mean 1153.450718
Median Absolute Deviation (MAD) 6
Skewness 0.5801259479
Sum 9151478
Variance 1361.776658
Monotonicity Increasing
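
Because day packs month and day into a single integer (1122 for 22 November), numeric statistics such as Range do not measure calendar time. The sketch below decodes the two extremes and compares the numeric range with the number of days between them. The function name is hypothetical, and the year is an assumption, since the value carries none; 1900 is used only to match the timestamp variable.

```python
from datetime import date

def decode_day(value, year=1900):
    """Split an MMDD integer into a date; the year is an assumed placeholder."""
    month, day = divmod(int(value), 100)
    return date(year, month, day)

first, last = decode_day(1122), decode_day(1206)
numeric_range = 1206 - 1122             # what the report calls Range
calendar_span = (last - first).days     # elapsed days between the extremes

print(first, last)
print("numeric range:", numeric_range, "calendar days:", calendar_span)
```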
Histogram with fixed size bins (bins=15)
Value Count Frequency (%)
1201 814 3.9%
1130 808 3.9%
1125 748 3.6%
1124 740 3.6%
1123 737 3.6%
1202 707 3.4%
1127 603 2.9%
1204 573 2.8%
1129 466 2.2%
1122 446 2.2%
Other values (5) 1292 6.2%
(Missing) 12799 61.7%

Value Count Frequency (%)
1122 446 2.2%
1123 737 3.6%
1124 740 3.6%
1125 748 3.6%
1126 299 1.4%
1127 603 2.9%
1128 240 1.2%
1129 466 2.2%
1130 808 3.9%
1201 814 3.9%

Value Count Frequency (%)
1206 77 0.4%
1205 308 1.5%
1204 573 2.8%
1203 368 1.8%
1202 707 3.4%
1201 814 3.9%
1130 808 3.9%
1129 466 2.2%
1128 240 1.2%
1127 603 2.9%

label
Categorical

HIGH CORRELATION
MISSING

The activity name with highest accuracy

Distinct 6
Distinct (%) 0.1%
Missing 12799
Missing (%) 61.7%
Memory size 162.1 KiB
Still 4948
Unknown 1171
Tilting 1093
OnFoot 453
InVehicle 260

Length

Max length 9
Median length 5
Mean length 5.763423242
Min length 5

Characters and Unicode

Total characters 45727
Distinct characters 20
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Tilting
2nd row Tilting
3rd row Still
4th row Still
5th row Still

Common Values

Value Count Frequency (%)
Still 4948 23.9%
Unknown 1171 5.6%
Tilting 1093 5.3%
OnFoot 453 2.2%
InVehicle 260 1.3%
OnBycicle 9 < 0.1%
(Missing) 12799 61.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
still 4948 62.4%
unknown 1171 14.8%
tilting 1093 13.8%
onfoot 453 5.7%
invehicle 260 3.3%
onbycicle 9 0.1%
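
The Common Values table and the Category Frequency Plot report different percentages for the same counts. The numbers are consistent with the first using all 20733 rows as the denominator and the second using only the non-missing rows. The sketch below recomputes both from the label counts; the denominators are inferred from the figures, not taken from the profiler's documentation.

```python
n_rows = 20733

# Label counts from the report.
label_counts = {
    "Still": 4948, "Unknown": 1171, "Tilting": 1093,
    "OnFoot": 453, "InVehicle": 260, "OnBycicle": 9,
}
n_present = sum(label_counts.values())

# Share of all rows versus share of rows that have a label.
share_all = {k: round(100 * v / n_rows, 1) for k, v in label_counts.items()}
share_present = {k: round(100 * v / n_present, 1) for k, v in label_counts.items()}

for name in label_counts:
    print(f"{name}: {share_all[name]}% of all rows, {share_present[name]}% of labelled rows")
```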

Most occurring characters

Value Count Frequency (%)
l 11258 24.6%
i 7403 16.2%
t 6494 14.2%
n 5328 11.7%
S 4948 10.8%
o 2077 4.5%
U 1171 2.6%
k 1171 2.6%
w 1171 2.6%
g 1093 2.4%
Other values (10) 3613 7.9%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 37071 81.1%
Uppercase Letter 8656 18.9%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
l 11258 30.4%
i 7403 20.0%
t 6494 17.5%
n 5328 14.4%
o 2077 5.6%
k 1171 3.2%
w 1171 3.2%
g 1093 2.9%
e 529 1.4%
c 278 0.7%
Other values (2) 269 0.7%
Uppercase Letter
Value Count Frequency (%)
S 4948 57.2%
U 1171 13.5%
T 1093 12.6%
O 462 5.3%
F 453 5.2%
I 260 3.0%
V 260 3.0%
B 9 0.1%

Most occurring scripts

Value Count Frequency (%)
Latin 45727 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
l 11258 24.6%
i 7403 16.2%
t 6494 14.2%
n 5328 11.7%
S 4948 10.8%
o 2077 4.5%
U 1171 2.6%
k 1171 2.6%
w 1171 2.6%
g 1093 2.4%
Other values (10) 3613 7.9%

Most occurring blocks

Value Count Frequency (%)
ASCII 45727 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
l 11258 24.6%
i 7403 16.2%
t 6494 14.2%
n 5328 11.7%
S 4948 10.8%
o 2077 4.5%
U 1171 2.6%
k 1171 2.6%
w 1171 2.6%
g 1093 2.4%
Other values (10) 3613 7.9%

accuracy
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The highest accuracy for possible activities

Distinct 75
Distinct (%) 0.9%
Missing 12799
Missing (%) 61.7%
Infinite 0
Infinite (%) 0.0%
Mean 86.34257625
Minimum 26
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 26
5-th percentile 40
Q1 86
median 99
Q3 100
95-th percentile 100
Maximum 100
Range 74
Interquartile range (IQR) 14

Descriptive statistics

Standard deviation 22.86123216
Coefficient of variation (CV) 0.26477357
Kurtosis 0.1052264047
Mean 86.34257625
Median Absolute Deviation (MAD) 1
Skewness -1.385442076
Sum 685042
Variance 522.6359357
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
100 3498 16.9%
99 1195 5.8%
40 1183 5.7%
96 336 1.6%
97 254 1.2%
98 248 1.2%
93 59 0.3%
92 53 0.3%
91 51 0.2%
94 48 0.2%
Other values (65) 1009 4.9%
(Missing) 12799 61.7%

Value Count Frequency (%)
26 1 < 0.1%
27 2 < 0.1%
28 2 < 0.1%
29 5 < 0.1%
30 3 < 0.1%
31 3 < 0.1%
32 3 < 0.1%
33 3 < 0.1%
34 4 < 0.1%
35 7 < 0.1%

Value Count Frequency (%)
100 3498 16.9%
99 1195 5.8%
98 248 1.2%
97 254 1.2%
96 336 1.6%
95 41 0.2%
94 48 0.2%
93 59 0.3%
92 53 0.3%
91 51 0.2%

InVehicle
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "in_vehicle" activity

Distinct 71
Distinct (%) 1.6%
Missing 16280
Missing (%) 78.5%
Infinite 0
Infinite (%) 0.0%
Mean 15.11250842
Minimum 1
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 3
median 10
Q3 10
95-th percentile 89
Maximum 98
Range 97
Interquartile range (IQR) 7

Descriptive statistics

Standard deviation 22.18792072
Coefficient of variation (CV) 1.468182522
Kurtosis 7.270027849
Mean 15.11250842
Median Absolute Deviation (MAD) 4
Skewness 2.862919214
Sum 67296
Variance 492.303826
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 1905 9.2%
1 736 3.5%
2 284 1.4%
3 106 0.5%
4 86 0.4%
6 73 0.4%
5 69 0.3%
96 61 0.3%
23 56 0.3%
9 53 0.3%
Other values (61) 1024 4.9%
(Missing) 16280 78.5%

Value Count Frequency (%)
1 736 3.5%
2 284 1.4%
3 106 0.5%
4 86 0.4%
5 69 0.3%
6 73 0.4%
7 43 0.2%
8 48 0.2%
9 53 0.3%
10 1905 9.2%

Value Count Frequency (%)
98 20 0.1%
97 47 0.2%
96 61 0.3%
95 15 0.1%
94 15 0.1%
93 21 0.1%
92 17 0.1%
91 15 0.1%
90 10 < 0.1%
89 12 0.1%

OnBycicle
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "on_bycicle" activity

Distinct 50
Distinct (%) 1.4%
Missing 17042
Missing (%) 82.2%
Infinite 0
Infinite (%) 0.0%
Mean 7.695204552
Minimum 1
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 3
median 10
Q3 10
95-th percentile 10
Maximum 98
Range 97
Interquartile range (IQR) 7

Descriptive statistics

Standard deviation 7.01975735
Coefficient of variation (CV) 0.9122249192
Kurtosis 66.4281333
Mean 7.695204552
Median Absolute Deviation (MAD) 0
Skewness 6.409381599
Sum 28403
Variance 49.27699326
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 2012 9.7%
1 584 2.8%
2 328 1.6%
3 206 1.0%
4 128 0.6%
5 86 0.4%
6 66 0.3%
7 41 0.2%
8 37 0.2%
9 27 0.1%
Other values (40) 176 0.8%
(Missing) 17042 82.2%

Value Count Frequency (%)
1 584 2.8%
2 328 1.6%
3 206 1.0%
4 128 0.6%
5 86 0.4%
6 66 0.3%
7 41 0.2%
8 37 0.2%
9 27 0.1%
10 2012 9.7%

Value Count Frequency (%)
98 1 < 0.1%
94 1 < 0.1%
92 1 < 0.1%
88 1 < 0.1%
87 1 < 0.1%
85 1 < 0.1%
84 1 < 0.1%
83 1 < 0.1%
82 1 < 0.1%
80 1 < 0.1%

OnFoot
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "on_foot" activity

Distinct 68
Distinct (%) 1.6%
Missing 16494
Missing (%) 79.6%
Infinite 0
Infinite (%) 0.0%
Mean 20.844067
Minimum 1
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 6
median 10
Q3 11
95-th percentile 94
Maximum 98
Range 97
Interquartile range (IQR) 5

Descriptive statistics

Standard deviation 29.47129657
Coefficient of variation (CV) 1.413893775
Kurtosis 1.847045072
Mean 20.844067
Median Absolute Deviation (MAD) 2
Skewness 1.894742966
Sum 88358
Variance 868.5573214
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 1926 9.3%
1 501 2.4%
2 144 0.7%
4 119 0.6%
3 112 0.5%
5 98 0.5%
96 94 0.5%
6 87 0.4%
11 69 0.3%
8 64 0.3%
Other values (58) 1025 4.9%
(Missing) 16494 79.6%

Value Count Frequency (%)
1 501 2.4%
2 144 0.7%
3 112 0.5%
4 119 0.6%
5 98 0.5%
6 87 0.4%
7 62 0.3%
8 64 0.3%
9 52 0.3%
10 1926 9.3%

Value Count Frequency (%)
98 32 0.2%
97 52 0.3%
96 94 0.5%
95 31 0.1%
94 33 0.2%
93 40 0.2%
92 41 0.2%
91 37 0.2%
90 29 0.1%
89 26 0.1%

Running
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "running" activity

Distinct 29
Distinct (%) 1.0%
Missing 17904
Missing (%) 86.4%
Infinite 0
Infinite (%) 0.0%
Mean 8.675503712
Minimum 1
Maximum 91
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 10
median 10
Q3 10
95-th percentile 10
Maximum 91
Range 90
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 5.688088761
Coefficient of variation (CV) 0.6556493951
Kurtosis 83.45415677
Mean 8.675503712
Median Absolute Deviation (MAD) 0
Skewness 6.91572269
Sum 24543
Variance 32.35435375
Monotonicity Not monotonic
Histogram with fixed size bins (bins=29)
Value Count Frequency (%)
10 2215 10.7%
1 345 1.7%
2 106 0.5%
3 62 0.3%
4 29 0.1%
5 19 0.1%
6 13 0.1%
7 7 < 0.1%
8 4 < 0.1%
11 3 < 0.1%
Other values (19) 26 0.1%
(Missing) 17904 86.4%

Value Count Frequency (%)
1 345 1.7%
2 106 0.5%
3 62 0.3%
4 29 0.1%
5 19 0.1%
6 13 0.1%
7 7 < 0.1%
8 4 < 0.1%
9 1 < 0.1%
10 2215 10.7%

Value Count Frequency (%)
91 1 < 0.1%
88 1 < 0.1%
85 1 < 0.1%
81 1 < 0.1%
76 1 < 0.1%
74 1 < 0.1%
69 1 < 0.1%
68 1 < 0.1%
63 1 < 0.1%
57 2 < 0.1%

Still
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "still" activity

Distinct 91
Distinct (%) 1.2%
Missing 13023
Missing (%) 62.8%
Infinite 0
Infinite (%) 0.0%
Mean 68.32594034
Minimum 1
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 3
Q1 10
median 98
Q3 100
95-th percentile 100
Maximum 100
Range 99
Interquartile range (IQR) 90

Descriptive statistics

Standard deviation 40.57475563
Coefficient of variation (CV) 0.5938411594
Kurtosis -1.383513007
Mean 68.32594034
Median Absolute Deviation (MAD) 2
Skewness -0.6903562138
Sum 526793
Variance 1646.310794
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
100 2560 12.3%
10 1673 8.1%
99 1281 6.2%
98 240 1.2%
96 237 1.1%
1 237 1.1%
97 194 0.9%
2 112 0.5%
3 52 0.3%
44 50 0.2%
Other values (81) 1074 5.2%
(Missing) 13023 62.8%

Value Count Frequency (%)
1 237 1.1%
2 112 0.5%
3 52 0.3%
4 31 0.1%
5 22 0.1%
6 33 0.2%
7 10 < 0.1%
8 15 0.1%
9 11 0.1%
10 1673 8.1%

Value Count Frequency (%)
100 2560 12.3%
99 1281 6.2%
98 240 1.2%
97 194 0.9%
96 237 1.1%
95 15 0.1%
94 13 0.1%
93 15 0.1%
92 11 0.1%
91 11 0.1%

Unknown
Categorical

HIGH CORRELATION
MISSING

The value of the "unknown" activity

Distinct 5
Distinct (%) 0.1%
Missing 15043
Missing (%) 72.6%
Memory size 162.1 KiB
1.0 2381
40.0 1726
2.0 1296
3.0 280
4.0 7

Length

Max length 4
Median length 3
Mean length 3.303339192
Min length 3

Characters and Unicode

Total characters 18796
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 40.0
2nd row 40.0
3rd row 1.0
4th row 1.0
5th row 1.0

Common Values

Value Count Frequency (%)
1.0 2381 11.5%
40.0 1726 8.3%
2.0 1296 6.3%
3.0 280 1.4%
4.0 7 < 0.1%
(Missing) 15043 72.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
1.0 2381 41.8%
40.0 1726 30.3%
2.0 1296 22.8%
3.0 280 4.9%
4.0 7 0.1%

Most occurring characters

Value Count Frequency (%)
0 7416 39.5%
. 5690 30.3%
1 2381 12.7%
4 1733 9.2%
2 1296 6.9%
3 280 1.5%

Most occurring categories

Value Count Frequency (%)
Decimal Number 13106 69.7%
Other Punctuation 5690 30.3%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 7416 56.6%
1 2381 18.2%
4 1733 13.2%
2 1296 9.9%
3 280 2.1%
Other Punctuation
Value Count Frequency (%)
. 5690 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 18796 100.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 7416 39.5%
. 5690 30.3%
1 2381 12.7%
4 1733 9.2%
2 1296 6.9%
3 280 1.5%

Most occurring blocks

Value Count Frequency (%)
ASCII 18796 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 7416 39.5%
. 5690 30.3%
1 2381 12.7%
4 1733 9.2%
2 1296 6.9%
3 280 1.5%

Walking
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "walking" activity

Distinct 61
Distinct (%) 1.4%
Missing 16494
Missing (%) 79.6%
Infinite 0
Infinite (%) 0.0%
Mean 20.68766218
Minimum 1
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 6
median 10
Q3 11
95-th percentile 94
Maximum 98
Range 97
Interquartile range (IQR) 5

Descriptive statistics

Standard deviation 29.33925685
Coefficient of variation (CV) 1.418200693
Kurtosis 1.940844507
Mean 20.68766218
Median Absolute Deviation (MAD) 2
Skewness 1.918739928
Sum 87695
Variance 860.7919926
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 1927 9.3%
1 501 2.4%
2 144 0.7%
4 119 0.6%
3 112 0.5%
5 98 0.5%
96 94 0.5%
6 87 0.4%
11 70 0.3%
8 65 0.3%
Other values (51) 1022 4.9%
(Missing) 16494 79.6%

Value Count Frequency (%)
1 501 2.4%
2 144 0.7%
3 112 0.5%
4 119 0.6%
5 98 0.5%
6 87 0.4%
7 63 0.3%
8 65 0.3%
9 53 0.3%
10 1927 9.3%

Value Count Frequency (%)
98 32 0.2%
97 52 0.3%
96 94 0.5%
95 31 0.1%
94 33 0.2%
93 40 0.2%
92 41 0.2%
91 36 0.2%
90 29 0.1%
89 26 0.1%

Tilting
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

The value of the "tilting" activity

Distinct 1
Distinct (%) 0.1%
Missing 18963
Missing (%) 91.5%
Memory size 162.1 KiB
100.0 1770

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 8850
Distinct characters 3
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 100.0
2nd row 100.0
3rd row 100.0
4th row 100.0
5th row 100.0

Common Values

Value Count Frequency (%)
100.0 1770 8.5%
(Missing) 18963 91.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
100.0 1770 100.0%

Most occurring characters

Value Count Frequency (%)
0 5310 60.0%
1 1770 20.0%
. 1770 20.0%

Most occurring categories

Value Count Frequency (%)
Decimal Number 7080 80.0%
Other Punctuation 1770 20.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 5310 75.0%
1 1770 25.0%
Other Punctuation
Value Count Frequency (%)
. 1770 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 8850 100.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 5310 60.0%
1 1770 20.0%
. 1770 20.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 8850 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 5310 60.0%
1 1770 20.0%
. 1770 20.0%

Interactions

[Figures: 81 pairwise interaction plots]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
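
The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report generator runs: the helper names `average_ranks` and `spearman_rho` are hypothetical, and giving tied values the mean of their rank positions is an assumption about tie handling.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    n = len(rx)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / n
    sx = (sum((a - mx) ** 2 for a in rx) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in ry) / n) ** 0.5
    return cov / (sx * sy)


# A nonlinear but strictly increasing relationship still has rho close to 1.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
```

Because only the ranks enter the calculation, any strictly increasing transformation of X or Y leaves ρ unchanged.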

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
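
A minimal sketch of this formula, together with a check of the location and scale invariance mentioned above. The function name `pearson_r` and the sample values are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    n = len(x)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)


# Illustrative values: y is roughly 2 * x with a little noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Shifting and rescaling each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# Negating one variable flips the sign of r but not its magnitude.
r_flipped = pearson_r(x, [-b for b in y])
```

Here `r_original` and `r_rescaled` agree up to floating-point error, while `r_flipped` has the same magnitude with the opposite sign.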

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
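
The pair-counting definition above can be sketched directly. The function name `kendall_tau_a` is hypothetical. The formula as stated, which divides by the total number of pairs, corresponds to the variant usually called τ-a. Statistical libraries often apply a tie-adjusted variant instead, so values computed by the report may differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)


tau_same = kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])
# Two concordant pairs and one discordant pair out of three.
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

This quadratic loop is meant to show the definition; it is not an efficient way to compute τ on tens of thousands of observations.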

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
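
As a sketch of what such a bias correction looks like, the following computes the corrected Cramér's V from a contingency table of counts, using the correction terms commonly attributed to Bergsma (2013). The function name `cramers_v_corrected` and the example tables are hypothetical, and the code assumes that every row and column of the table has a nonzero total.

```python
def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[c] for row in table) for c in range(k)]
    n = sum(row_totals)

    # Pearson chi-squared statistic (assumes no empty row or column).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction: subtract the expected bias and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return (phi2_corr / min(k_corr - 1, r_corr - 1)) ** 0.5


v_perfect = cramers_v_corrected([[10, 0], [0, 10]])
v_independent = cramers_v_corrected([[5, 5], [5, 5]])
```

With the uncorrected estimator, small tables of independent variables tend to produce values above 0; the `max(0.0, ...)` term is what pulls those cases back to 0.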

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
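
One way to read nullity correlation is as the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). The sketch below assumes that interpretation. The helper names and the toy rows are hypothetical; the rows only borrow column names from this dataset and are not real observations.

```python
from statistics import mean


def presence(rows, column):
    """1 where the column has a value, 0 where it is missing (None)."""
    return [0 if row[column] is None else 1 for row in rows]


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation of two presence indicators, or None if undefined."""
    a, b = presence(rows, col_a), presence(rows, col_b)
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sum((p - ma) ** 2 for p in a) ** 0.5
    sb = sum((q - mb) ** 2 for q in b) ** 0.5
    if sa == 0 or sb == 0:
        # A column that is always present or always missing has no variance.
        return None
    return cov / (sa * sb)


# Toy rows: "accuracy" and "Still" are missing together, "Walking" is the opposite.
rows = [
    {"accuracy": 92.0, "Still": 92.0, "Walking": None},
    {"accuracy": None, "Still": None, "Walking": 71.0},
    {"accuracy": 85.0, "Still": 85.0, "Walking": None},
    {"accuracy": None, "Still": None, "Walking": 64.0},
]

together = nullity_correlation(rows, "accuracy", "Still")
opposite = nullity_correlation(rows, "accuracy", "Walking")
```

A value of 1 means the two columns are always present or missing together, and -1 means one is present exactly when the other is missing.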
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.